Dividing by zero with SAS - myths and realities



To those of you who have not read my previous post, Dividing by zero with SAS, it's not too late to go back and catch up. You missed a lot of fun, deep thought, and an opportunity to solve an unusual SAS coding challenge.

For those who have already read it, let’s get serious for a second.

When it comes to division by zero in a SAS DATA step, I have found a lot of misconceptions and misunderstandings percolating through SAS online communities and discussion groups. The goal of this post is to dispel all those fallacies and delusions, also known in civilized societies as myths.

Myth

When SAS encounters division by zero during program execution it generates an ERROR message in the SAS log and stops. (A more histrionic version of this myth is, “Your computer disconnects from the Internet and burns down.”)

Reality

SAS does not generate an ERROR message in the SAS log and does not stop executing the current step and subsequent steps in the event of division by zero.

Myth

When SAS encounters division by zero during program execution it generates a WARNING message in the SAS log and continues execution.

Reality

Consider yourself officially warned that SAS does not generate even a WARNING message in the SAS log in the event of division by zero.

When SAS encounters division by zero during SAS data step execution it does the following:

1. For each occurrence, it generates a NOTE in the SAS log:

NOTE: Division by zero detected at line XXX column XX.

2. For each occurrence, it sets the result of division by zero to a missing value and dumps the Program Data Vector (PDV) including data step variables and automatic variables (_N_ and _ERROR_) to the SAS log, e.g.

a=0 b=0 c=. _ERROR_=1 _N_=1

3. Finally, it generates a summary NOTE to the SAS log, e.g.

NOTE: Mathematical operations could not be performed at the following places. The results of the operations have been set to missing values.
Each place is given by: (Number of times) at (Line):(Column).
3 at 244:8

You can run the following SAS code snippet to confirm reality over myth for yourself:

data a;
   input n d;
   datalines;
2  0
-2 0
0  0
;
run;

data b;
   set a;
   c = n/d;
run;

Myth

SAS programmers don’t care what messages SAS generates in the SAS log about division by zero and just ignore them.

Reality

Good programming practices and good SAS programmers do care about the possibility of dividing by zero and try to eliminate those messages by taking control over the situation. For example, instead of blindly dividing by a variable that can potentially have zero values, you can write the following “clean” code:

if d ne 0
   then r = n/d;
   else r = .;

In this case the division by zero event will never happen during program execution, and you will not receive any nastygrams in the SAS log, but still the result will be the same – a missing value.
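As a rough sketch of the guard in a complete program: the dataset names a and c and the extra 6/3 row are assumptions for illustration only. The second step also shows one possible variant that checks for missing inputs with nmiss(), since a plain d ne 0 test still lets a missing d (or n) through to the division.

```sas
/* Illustrative input; the last row has a nonzero divisor */
data a;
   input n d;
   datalines;
2  0
-2 0
0  0
6  3
;
run;

/* Guarded division: n/d is executed only when d is not zero */
data c;
   set a;
   if d ne 0
      then r = n/d;
      else r = .;
run;

/* Variant (an assumption, not from the post): also skip the      */
/* division when either operand is missing; nmiss() counts the    */
/* missing numeric arguments, including special missing values    */
data c2;
   set a;
   if nmiss(n, d) = 0 and d ne 0
      then r = n/d;
      else r = .;
run;
```

With this input, neither step should attempt a division by zero, so the log should stay free of the “Division by zero detected” NOTE.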

Myth

To avoid those “Division by zero detected” NOTEs, one can use the ifn() function as in the following example:

r = ifn(d=0, ., n/d);

The assumption is that SAS will first evaluate the first argument (logical expression d=0); if it is true, then return the second argument (missing value); if false, then evaluate and return the third argument (n/d).

Reality

Despite the ifn() function being one of my favorites, the reality is that it works somewhat differently than the above assumption – it evaluates all its arguments before deciding which argument to return. If d does in fact equal 0, evaluating the third argument, n/d, will trigger an attempt to divide by 0, resulting in the “Division by zero detected” NOTE and the PDV dump in the SAS log; that disqualifies this function from being a graceful handler of division by zero events.
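A small sketch to check this behavior for yourself; the values of n and d are arbitrary and chosen only so that the first argument of ifn() is true.

```sas
data _null_;
   n = 5;
   d = 0;
   /* d = 0 is true, so ifn() returns the second argument (.), */
   /* but n/d is still evaluated as the third argument          */
   r = ifn(d = 0, ., n/d);
   put r=;
run;
```

If ifn() behaves as described above, the log should contain the “Division by zero detected” NOTE and the PDV dump even though r ends up as a missing value either way.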

Myth

SAS does not have an effective solution for graceful handling of division by zero events; therefore, SAS programmers are compelled to write additional special programming logic.

Reality

SAS doesn’t just have an effective solution to gracefully handle division by zero events. It has a perfect solution! That perfect solution is even conveniently named the divide() function.

The divide(n,d) function takes two arguments, each of which is a numeric constant, variable, or expression. It divides the first argument by the second and returns a non-missing quotient when neither argument is missing and the second argument is not zero. Otherwise, it returns a missing value. No ugly “Division by zero detected” NOTEs in the SAS log, just what we want.

So, instead of using:

r = n/d;

you would just use

r = divide(n,d);

Moreover, it gives you extended functionality by providing additional information about its arguments. In the case of a zero divisor, it returns one of three different missing values. If the dividend (the first argument) is positive, it returns the special missing value .I (I for infinity); if the dividend is negative, it returns the special missing value .M (M for minus infinity); if the dividend is zero, it returns ., the ordinary missing value.
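Here is a minimal sketch that exercises each of these cases. The input rows are illustrative: three zero-divisor rows (positive, negative, and zero dividend) plus one ordinary division for comparison.

```sas
data _null_;
   input n d;
   r = divide(n, d);   /* no "/" operator, so no division-by-zero NOTE */
   put n= d= r=;       /* write each result to the log                 */
   datalines;
2  0
-2 0
0  0
6  3
;
run;
```

Per the description above, the first three rows should produce .I, .M, and . respectively, and the last row a regular quotient, with no “Division by zero detected” NOTE in the log.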

The icing on the cake

SAS’ missing() function equates all those ordinary and special missing values:

missing(.) = missing(.I) = missing(.M) = 1.

If for some reason you don’t like having different missing values in your result X after using X=divide(n,d), you can easily recode them all to the generic missing value:

if missing(X) then X=.;
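
Put together as a short runnable sketch (the variable name was_missing is an assumption added for illustration):

```sas
data _null_;
   X = divide(1, 0);          /* positive dividend, zero divisor        */
   was_missing = missing(X);  /* missing() is true for special missings */
   if missing(X) then X = .;  /* recode to the ordinary missing value   */
   put X= was_missing=;
run;
```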

Your Comments

How do you prefer handling division by zero? Please share with us.


About Author

Leonid Batkhan

Leonid Batkhan is a long-time SAS consultant and blogger. Currently, he is a Lead Applications Developer at F.N.B. Corporation. He holds a Ph.D. in Computer Science and Automatic Control Systems and has been a SAS user for more than 25 years. From 1995 to 2021 he worked as a Data Management and Business Intelligence consultant at SAS Institute. During his career, Leonid has successfully implemented dozens of SAS applications and projects in various industries.

15 Comments

  1. I'm learning SAS, and I'm reminded of xkcd comic number 1172, because I was hoping to use my computer disconnecting from the internet and burning down to catch errors (in my case, I was using "if A then do this, else if B then do this, else 1/0" when all possible cases should be in A and B).
    I suppose it's good that I learned this now rather than later on when a division by zero would be bad and I'd have no idea that my very bad division by zero won't be caught, but still, sometimes you want a quick and dirty "disconnect from the internet and burn down"

    • Leonid Batkhan

      Haha! It's a good thing that you are learning SAS, however, division by 0 may not be bad either, just use divide() function instead of /. So instead of 1/0 you would write divide(1,0), then you can burn down your computer or else. Did you miss that function in the blog? And if you are a comic lover, you might like another blog Dividing by zero with SAS. Enjoy!

    • Leonid Batkhan

      Hm... Isn't it called data manipulation? I would say MATH is a higher authority than "your analytics", and MATH does not allow that. If your d<0 then MAX(d,0.001) will always return 0.001 which will make your d value irrelevant. Besides, "very small number" is a relative notion. What if your numerator is even smaller? NO, PLEASE DON'T DO THAT!

  2. Good article Leonid. Another thing about dividing by zero is that it can really slow down your program. I benchmarked this in the paper "Not dividing by zero: last of the low hanging efficiency fruit", at https://support.sas.com/resources/papers/proceedings13/325-2013.pdf.

    I re-ran today for Windows desktop and Linux 9.4TS1M4 with the same code as in the paper and got the following results (in the form real time / cpu time):

                           Linux             Windows     
    Divide by zero         26.72 / 26.69     45.64 / 45.59
    Avoid divide by zero    2.75 /  2.57      1.62 /  1.20
    

    • Leonid Batkhan

      Thank you, Bruce. Great point and great addition to this post. Thank you for sharing your benchmarking results, the numbers are really impressive. Windows is really bad at dividing by zero 🙂

    • Leonid Batkhan

      Hi Walt, thank you for your comment. Your way of handling it is better than nothing, but there are, however, some caveats.
      First, while your solution works for most cases it is hardly simpler than x = divide(n,d);
      Second, d>. does not guarantee that d is not missing. For example, if d=.y (special missing value) then your if condition d>. will miss this since .y is greater than . (ordinary missing value). If you submit the following code

      data _null_;
         n=5;
         d=.y;
         if d>. and d ^= 0 then x = n/d;
         put x=;
      run;
      

      you still will end up with the following message in your SAS log:

      x=.
      NOTE: Missing values were generated as a result of performing an operation on missing values.
            Each place is given by: (Number of times) at (Line):(Column).
      

      This is the same as not having your if statement at all and just have x = n/d;

      And third, using divide() function will not produce the above nasty-gram in SAS log even when the dividend is missing, instead it will quietly assign missing value to the result.

  3. Rick Wicklin

    Thanks for the overview of how the DATA step behaves. One clarification: the DATA step will display a WARNING when the number of NOTEs exceeds a predetermined number. If the DATA step encounters many divisions by zero (20, by default), you will see many notes that look like
    NOTE: Division by zero detected at line ...
    and then you will eventually see the following warning:
    WARNING: Limit set by ERRORS= option reached. Further errors of this type will not be printed.

    You can see the value of the ERRORS= system option by using

    proc options option=ERRORS value; run;

    and you can change the value by using the OPTIONS statement.

    It is also worth noting that your comments apply to the DATA step, not to all SAS languages and procedures. For example, in SAS/IML, dividing by zero results in a warning:
    WARNING: Division by zero, result set to missing value.
    In some SAS/STAT procedures that use the CMP subsystem (such as PROC NLMIXED), the procedure will abort execution and stop:
    NOTE: Execution error for observation 1.
    NOTE: The SAS System stopped processing this step because of errors.

    • Leonid Batkhan

      Thank you, Rick, for weighing in. All valid points that make excellent addition to this post. As you suggested, I have added clarification in the third paragraph regarding this post being about the DATA step programming.

  4. I would consider using a format (e.g. infbest.) to display infinities:

    proc format;
      value infbest(DEFAULT=32 FUZZ=0 MULTILABEL NOTSORTED)
      .I = "+Infinity"
      .M = "-Infinity"
      ._ - .Z = "Missing"
      other = [best32.]
      ;
    run;
    
    data test;
    input a b;
    format c infbest.;
    c = divide(a,b);
    cards;
    1 0
    1 1
    -1 0
    . 1
    . .
    . 0
    1 .
    0 .
    -1 .
    1 2
    -1 3
    0 4
    ;
    run;
    
    proc print;
    run;
    

    • Leonid Batkhan

      Thank you, Bart, great point! SAS formats are always useful. Knowing that symbol "∞" has Unicode representation of '221E' we can even write user-defined format to use "∞" for "Infinity":

      proc format;
        value infbest
        low-high = [best32.]
        .I = "+(*ESC*){unicode '221E'x}"
        .M = "-(*ESC*){unicode '221E'x}"
        other = "Missing"
        ;
      run;
      
